Homework 4: Contours and Gradients

Remember that we used code from people.emich.edu/aross15/math560/nlp-matlab; you might want to refer to it.

#1: We know that quadratics are useful because smooth objective functions look like quadratics if you zoom in just right. Consider the objective function for Ordinary Least Squares (OLS) regression: the sum of the squared residuals, where
residual = (y data value) - (m*x + b)
Is this function (sum of squared residuals) exactly quadratic as a function of m & b? Show symbolically that this objective function is or is not exactly quadratic, using symbolic arguments rather than numeric investigation.

#2: Re-do the contour plot using the sum of absolute values, rather than the sum of squares. You might need to adjust:
* the resolution of the grid we compute, or
* the number or values of the contour levels, or
* the range of values that we scan,
to get a pretty picture. You may use either the
predictions = myslope * xvals + myintercept;
or
predictions = myslope * (xvals-mean(xvals)) + myintercept;
version. If you don't want to use Matlab/Octave, you may use whatever other software suits you. It's even possible to do it in WolframAlpha, though you'd have to enter each residual computation individually.
Hint: abs(stuff) computes the absolute value of each element in a vector or matrix named "stuff", giving you back a vector or matrix of the same size. For example, abs([3 4 -5 0 -10]) = [3 4 5 0 10]. (A code sketch is given after #4 below.)
Copy/paste the contour plots into a Word file or the equivalent. Comment on how much the contours are like ellipsoids.

#3: Re-do the contour plot using the maximum absolute residual rather than the sum of squares or absolute values. Hint: use max() instead of sum(). Again, you might need to adjust the resolution of our grid or the contours you choose. (See the sketch after #4 below.) FYI, minimizing the maximum absolute residual is called "Chebyshev" regression. It is usually a bad idea--it is very susceptible to outliers.
Copy/paste the contour plots into a Word file or the equivalent. Comment on how much the contours are like ellipsoids.

#4: Gradients of Ordinary Least Squares. Consider the objective function:
sum over i of (y_i - (m*x_i + b))^2
a) Compute its gradient symbolically, by hand.
b) Change the objfunc.m file to include a computation of the gradient in the Ordinary Least Squares case. It will start like this:
function [objfuncval,gr] = objfunc(myslope,myintercept,datamat)
and then the last line will be something like
gr = (whatever formula will compute the gradient)
If you want to compute the gradient component-by-component, that is, one formula for the partial with respect to myslope and another for the partial with respect to myintercept, you can do
gr(1) = (formula for partial w.r.t. slope)
gr(2) = (formula for partial w.r.t. intercept)
Note that you don't have to actually run it in Matlab/Octave if you're struggling with it; just type formulas that it would be able to interpret.
Copy/paste those 2 lines of code into a Word file or equivalent.
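For #2, here is a minimal Octave/Matlab sketch of the grid-and-contour computation with the sum of absolute residuals. The data values and grid ranges below are placeholders, not the course data -- substitute the xvals, yvals, and scan ranges from the nlp-matlab code. (This uses the un-centered "predictions" version.)

xvals = [1 2 3 4 5];                    % placeholder x data
yvals = [2 4 5 4 6];                    % placeholder y data
slopegrid     = linspace(-2, 4, 201);   % slopes to scan (placeholder range)
interceptgrid = linspace(-5, 8, 201);   % intercepts to scan (placeholder range)
objvals = zeros(length(interceptgrid), length(slopegrid));
for i = 1:length(slopegrid)
  for j = 1:length(interceptgrid)
    predictions  = slopegrid(i) * xvals + interceptgrid(j);
    residuals    = yvals - predictions;
    objvals(j,i) = sum(abs(residuals));   % sum of absolute residuals
  end
end
contour(slopegrid, interceptgrid, objvals, 30);   % try more or fewer contour levels
xlabel('slope'); ylabel('intercept');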
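For #3, only the inner objective line changes under the same setup; the largest absolute residual is

objvals(j,i) = max(abs(residuals));   % Chebyshev: largest absolute residual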
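For #4(b), a finite-difference check is a handy way to confirm the gradient formula you put in objfunc.m. This sketch is self-contained (it does not call objfunc.m) and uses placeholder data and a hypothetical test point; compare its output against the gr that your objfunc.m returns at the same slope and intercept.

xvals = [1 2 3 4 5];  yvals = [2 4 5 4 6];       % placeholder data
ols = @(m, b) sum((yvals - (m*xvals + b)).^2);   % OLS objective
m0 = 0.7;  b0 = 1.2;  h = 1e-6;                  % hypothetical test point, step size
gr_fd(1) = (ols(m0+h, b0) - ols(m0-h, b0)) / (2*h);   % approx. partial w.r.t. slope
gr_fd(2) = (ols(m0, b0+h) - ols(m0, b0-h)) / (2*h);   % approx. partial w.r.t. intercept
disp(gr_fd)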
#5: (from Winston, pg. 628) The energy used in compressing a gas in 3 stages from an initial pressure "I" to a final pressure "F" is given by the formula
K*(sqrt(p1/I) + sqrt(p2/p1) + sqrt(F/p2) - 3)
where K is some arbitrary positive constant. Let's use K=7. Formulate an NLP whose solution describes how to minimize the energy used in compressing the gas.
a) Using the values I=1 (atmosphere) and F=10, solve the NLP numerically (using Excel Solver or equivalent); attach your Solver file along with your Word file when you submit the homework.
b) Based on the optimal p1 and p2 values you got, write down a formula for the optimal p1 and p2 as a function of I and F. This involves some investigation and guessing: try taking a look at p1/I, p2/p1, and F/p2.
c) Prove that your conjecture is indeed optimal (take the gradient, etc.). Note: I am asking you to make a guess, then prove the guess is optimal by showing it satisfies the appropriate equation. You could go the other way: take the gradient and try to solve the equation gradient = zero vector. I think my way is easier.
d) Find the Hessian at the optimal point in part (a); your answer should be a matrix with numbers in it, rather than formulas. You could do this by finite differences, or by taking the Hessian symbolically and then plugging in values. (A code sketch of the finite-difference approach is given at the end of this assignment.)

#6: (optional) Re-do #2 and #3 using a different set of (x,y) data that do not fall perfectly on a straight line. For example,
x  y
1  1
1  3
5  5
5  7
Do there seem to be multiple optima? You might also try doing a contour plot of the original least-squares objective function.

#7: (optional) A Control Theory problem. Implement Example 4 (Control) in Chapter 7.3 in the Luenberger textbook in Excel Solver or equivalent. Also try putting different weights on the x variables vs. the u variables. I'm also wondering: is the optimal solution a Proportional control? That is, is u = (some constant)*x at each step? Or is there some other simple rule like that?

#8: (optional) An Approximation problem. Problem 1 in section 7.10 in the Luenberger textbook: how to approximate a function with a polynomial.
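For #5(d), if you go the finite-difference route, here is a minimal Octave/Matlab sketch of a central-difference Hessian. The values of p1star and p2star below are placeholders; replace them with the optimal p1 and p2 you found in part (a).

K = 7;  I = 1;  F = 10;
energy = @(p1, p2) K*(sqrt(p1/I) + sqrt(p2/p1) + sqrt(F/p2) - 3);
p1star = 3;  p2star = 6;   % placeholders: use your optimal values from part (a)
h = 1e-4;                  % finite-difference step size
H = zeros(2,2);
H(1,1) = (energy(p1star+h, p2star) - 2*energy(p1star, p2star) + energy(p1star-h, p2star)) / h^2;
H(2,2) = (energy(p1star, p2star+h) - 2*energy(p1star, p2star) + energy(p1star, p2star-h)) / h^2;
H(1,2) = (energy(p1star+h, p2star+h) - energy(p1star+h, p2star-h) ...
        - energy(p1star-h, p2star+h) + energy(p1star-h, p2star-h)) / (4*h^2);
H(2,1) = H(1,2);           % Hessian is symmetric
disp(H)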